Hi! Today we will cover NSE and its veeeery weird nature. Are you ready? Me neither. SO LET’S DO IT

Our boys

Let’s meet today’s villains. The three NSEketeers.

Scroll down to see el explanationi.

makeNseFunction1 <- function(fun) {
  function(data, elementOrFormula, ...) {
    functionEnvironment = environment()
    
    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      argument = data[[deparse(substitute(elementOrFormula))]]
    } else {
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data), envir = functionEnvironment)
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
} 

makeNseFunction2 <- function(fun) {
  function(data, elementOrFormula, ...) {
    library(rlang)
    
    functionEnvironment = environment()

    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      elementOrFormula = substitute(elementOrFormula)
      argument = eval(enexpr(elementOrFormula), data)
    } else {
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data))
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
}

makeNseFunction3 <- function(fun) {
  function(data, elementOrFormula, ...) {
    functionEnvironment = environment()
    
    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      argument = eval(substitute(elementOrFormula), data)
    } else {
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data))
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
} 

Each of the functions implements a different way to do a simple thing.

CREATING AN NSE VERSION OF A GIVEN FUNCTION WHICH RETRIEVES A FIELD USING NON-STANDARD EVALUATION WITH BUILT-IN R MECHANISMS OR EXTERNAL METHODS

Sounds evil?

Well, it is. But it is quite easy to understand.

The thing is that in R you can get an element from, e.g., a list like so:

myList = list(a = 1, b = 2, c = 3)
myList$a
## [1] 1

BUT. Have you ever wondered what actually is “a” here? The one in the second line.

Let’s check it:

a

… Actually I cannot check it, because my markdown wouldn’t compile.

I would get an error saying that “a” is not found. Well, it was not defined, so it should NOT be found.
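If you really want to see the error without breaking anything, it can be caught instead. A small sketch, assuming a fresh session with no variable called a (errorMessage is just a name made up for this example):

```r
myList = list(a = 1, b = 2, c = 3)

# Evaluate the bare name a, but catch the error instead of stopping.
errorMessage = tryCatch(a, error = function(e) conditionMessage(e))

print(errorMessage)              # the text of the "not found" error
print("a" %in% names(myList))    # "a" only exists as an element name of myList
```

So “a” lives only inside the list as an element name, not as a variable of its own.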

How does it work in the first example then? Let’s call it R magic for now.

Another question:

What if I wanted to pass “a” as a parameter to a function and use the $ operator inside my function like so:

myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  myList$a
}

myFunction(a, myList)
## [1] 1

WOW, it works with no problems.

It’s R - it only looks like it works.

Let’s see the example where I want to get “b”:

myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  myList$a
}

myFunction(b, myList)
## [1] 1

Oops.

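I am not going to go into detail, but the problem is that the $ operator never evaluates what comes after it. It takes the name literally, so myList$a always looks up the element called “a”, no matter what was passed in as the argument a (or at least that’s how I understand it).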
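Here is a small sketch of what seems to go wrong: $ uses the name after it literally, while [[ evaluates whatever you hand it. The function names getWithDollar and getWithBrackets are made up for this illustration.

```r
myList = list(a = 1, b = 2, c = 3)

# $ treats "a" as a literal element name; the argument's value is never used.
getWithDollar = function(a, myList) {
  myList$a
}

# [[ evaluates its index, so passing the element name as a string works.
getWithBrackets = function(name, myList) {
  myList[[name]]
}

getWithDollar("b", myList)    # still returns element "a"
## [1] 1
getWithBrackets("b", myList)  # returns element "b"
## [1] 2
```

Passing a string to [[ is the plain, standard-evaluation way out. It just does not let you type a bare name like b.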

FIX IT

Long story short, to fix this error we have to do something like this:

myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  eval(substitute(a), myList)
}

myFunction(b, myList)
## [1] 2

What what what. What happened here?

In very very basic words we can say that:

“Function eval evaluates an expression, looking up the names in it inside the given object first (here myList)”

What about “substitute”?

It simply “retrieves” the name that was originally typed as the argument, without evaluating it, so that eval can look it up in the list in the same way as in:

myList$a
## [1] 1
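To see the two steps separately, here is a minimal sketch. The helper inspect and the fields of its result are names invented for this example only.

```r
myList = list(a = 1, b = 2, c = 3)

inspect = function(a, myList) {
  # Step 1: grab the expression the caller typed, without evaluating it.
  captured = substitute(a)
  list(
    captured = deparse(captured),       # the expression as text
    isName = is.name(captured),         # TRUE when a bare name was typed
    value = eval(captured, myList)      # Step 2: look it up inside myList
  )
}

result = inspect(b, myList)
str(result)
```

Here result$captured is the text "b", result$isName is TRUE, and result$value is 2, even though no variable called b exists anywhere.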

If you want a proper explanation I recommend checking these examples: Examples

Or these explanations:

Explanation Explanation

I also recommend listening to Hadley Wickham’s 5 (actually 6) minute talk about “Tidy evaluation”. It really helped me NOT to kill myself in this process. Hope it helps you too.

Tidy evaluation in 5 mins

Getting back to our NSEketeers.

What you saw at the beginning are a few implementations, each of which should create an NSE (non-standard evaluation) function from a non-NSE function.

We want this call:

min(myList$a)

to be equal to this call:

min_NSE(myList, a)

They also take formulas into account, but that would be too much to explain at once, so we’ll skip it.
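With the formula branch left out, the core idea can be sketched in a few lines. Note that makeSimpleNse is a hypothetical, stripped-down factory written for this example; it is not one of the three NSEketeers, only the eval(substitute(...)) trick from earlier wrapped around an arbitrary function.

```r
# Simplified, hypothetical factory: handles only a bare element name.
makeSimpleNse = function(fun) {
  function(data, element, ...) {
    # Capture the typed name and look it up inside data.
    argument = eval(substitute(element), data)
    fun(argument, ...)
  }
}

myList = list(a = c(3, 1, 2))
min_NSE = makeSimpleNse(min)

min_NSE(myList, a)
## [1] 1
identical(min_NSE(myList, a), min(myList$a))
## [1] TRUE
```

Anything passed through ... (for example na.rm = TRUE) goes straight to the wrapped function.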

We are here to check their time efficiency.

I also wanted to check their memory usage but… let’s see this Stack Overflow answer.

So onto the testing we go!

Efficiency testing

testedFunction = min
datasetSize = 100
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

testedFunction = min
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

testedFunction = max
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

testedFunction = mean
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

testedFunction = mean
datasetSize = 1000000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

testedFunction = lm
datasetSize = 100
dataset = list(x = sample(x = 1:100, size = datasetSize, replace = TRUE), y = sample(x = 1:100, size = datasetSize, replace = TRUE))
formula = x ~ y
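For reference, here is a rough sketch of how one of the configurations above could be timed using base R only. This is not necessarily how the measurements were made: makeSimpleNse, timeIt and the repetition count are assumptions for this example, and a dedicated benchmarking package would give more reliable numbers.

```r
set.seed(1)  # only so the sampled data is reproducible

testedFunction = min
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))

# Hypothetical stand-in for the factories (bare-name branch only).
makeSimpleNse = function(fun) {
  function(data, element, ...) {
    fun(eval(substitute(element), data), ...)
  }
}
testedFunctionNse = makeSimpleNse(testedFunction)

# Run a zero-argument function `repetitions` times; return elapsed seconds.
timeIt = function(f, repetitions = 1000) {
  timing = system.time(for (i in seq_len(repetitions)) f())
  unname(timing["elapsed"])
}

directTime = timeIt(function() testedFunction(dataset$a))
nseTime = timeIt(function() testedFunctionNse(dataset, a))

# Both calls must agree before their speed is worth comparing.
stopifnot(identical(testedFunction(dataset$a), testedFunctionNse(dataset, a)))
print(c(direct = directTime, nse = nseTime))
```

The printed numbers depend on the machine, so none are shown here. Swapping testedFunction and datasetSize reproduces the other configurations.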